Two layer video signal coding

ABSTRACT

A method of coding a video signal for transmission includes the steps of coding data representing the video signal by a base layer coding operation which includes a base layer quantizer having a base layer quantization step size to provide coded video data for transmission; deriving inverse-coded video data by carrying out an inverse base layer coding operation on the coded video data; deriving difference data from the data representing the video signal and the inverse-coded video data; and coding the difference data by an enhancement layer coding operation only when the energy of the difference data exceeds a variable threshold, the threshold being inversely proportional to the base layer quantization step size.

The present invention relates to the coding of video signals.

Techniques are well known for coding digitised video signals to achieve data compression and thereby reduce the bit rate required to transmit the coded video signals. An example of such a technique is the CCITT recommendation H.261 Video Coding Standard, which exploits spatial and temporal redundancies in a video coding process to achieve data compression. Such redundancies vary with picture content and hence the level of data compression and the resultant required bit rate also vary. To facilitate operation with fixed or constant bit rate transmission channels, buffering of the coded video data takes place. However, this buffering is insufficient to cope with large and rapid variations in data rate as experienced, for example, with a scene change or as a result of motion in the picture. In such circumstances parameters of the coding process are adjusted to reduce the coded data rate. There is, however, a resultant reduction in picture quality. One form of parameter control involves adjusting the step size of the quantisation stage of the coding process in relation to the fullness of the buffer. The overall result is that for fixed rate transmission channels the picture quality is variable, with coding distortions being particularly visible at some times, whereas at other times channel capacity may be wasted because there are few changes to be transmitted.

The prospect of asynchronous transfer mode (ATM) networks such as broadband ISDN, CCITT recommendation I121, offers the possibility of variable bit rate transmission channels with potential benefits for the type of video coding just described. A first approach might be to dispense with the buffering of the coded video data and to exploit the variable bit rate channel of an ATM network to cope with the variable coded data rate of the video transmission. However, ATM networks, which will commonly be packet or cell based, are potentially liable to packet or cell loss, and highly predictive video coding techniques would not respond well to intermittent data loss. To overcome this problem and take advantage of the variable bit rate transmission channels, it has been proposed (N Ghanbrai, IEEE Journal of Selected Areas of Communication, vol 7, no 5, June 1989 pp 771-781) to employ two layer video coding with a first, base layer coding containing essential video data and a second, enhancement layer coding containing the difference between input data and the result of the first layer coding, see FIG. 1. The coded data from the base layer coding can be sent via a constant bit rate (CBR) transmission channel with "guaranteed" packets and coded data from the enhancement layer can be transmitted over a variable bit rate (VBR) transmission channel. In the event that packets or cells are lost from the enhancement data on the VBR channel, a minimum picture quality will be maintained by the base layer data sent over the CBR channel.

It is an object of the present invention to provide an improved method of coding video signals.

According to the present invention a method of coding a video signal for transmission, comprises:

coding data representing the video signal by a base layer coding operation which includes a base layer quantization step size to provide coded video data for transmission;

deriving inverse-coded video data by carrying out an inverse base layer coding operation on the coded video data;

deriving difference data from the data representing the video signal and the inverse-coded video data; and

coding the difference data by an enhancement layer coding operation;

characterised in that the difference data is coded only when the energy of the difference data exceeds a variable threshold, the threshold being inversely proportional to the base layer quantization step size.

According to a further aspect of the present invention apparatus for encoding a video signal for transmission, comprises:

means for coding data representing the video signal by a base layer coding operation which includes a base layer quantizer having a base layer quantization step size to provide coded video data for transmission;

means for deriving inverse-coded video data by carrying out an inverse base layer coding operation on the coded video data;

means for deriving difference data from the data representing the video signal and the inverse-coded video data; and

means for coding the difference data by an enhancement layer coding operation;

characterised in that the means for coding the difference data operates only when the energy of the difference data exceeds a variable threshold, the threshold being inversely proportional to the base layer quantization step size.

A preferred embodiment of the invention will now be described by way of example and with reference to the accompanying drawings, wherein:

FIG. 1 is a schematic diagram of a two layer video coding process;

FIG. 2 is a schematic diagram of an H.261 video encoder modified in accordance with an embodiment of the invention;

FIG. 3 is a schematic diagram of an H.261 video decoder modified in accordance with an embodiment of the invention; and

FIG. 4 is a graph illustrating the reduced signal to noise ratio obtainable with the present invention.

Referring generally to FIGS. 2 and 3, an embodiment of the invention will be described as a modification to the CCITT H.261 coding process as exemplified by the video encoder and decoder of FIGS. 2 and 3, respectively. The invention is applicable to other coding schemes; the H.261 standard is chosen to illustrate the principles of the embodiments of the invention, and is not chosen by way of limitation. The parts of the embodiment of the invention illustrated in FIGS. 2 and 3 corresponding to the H.261 standard are shown contained within broken line boxes. As these parts of the encoder and decoder are well known they will not be described in detail.

Referring first to FIG. 2, encoding of video input data takes place according to the H.261 standard to provide a base layer of coded video data for transmission over a CBR channel. From the H.261 coding process the coded data after DCT coding and prior to quantisation is extracted. In the H.261 coding process the DCT coded data is quantised by quantiser 2 controlled by control 4 for transmission and this quantised data is also inverse quantised for use in subsequent coding steps. The extracted DCT coded data is subtracted from the inverse quantised data to produce variable difference data. The variable difference data is processed for transmission over a variable bit rate transmission channel, as will now be described.
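By way of illustration only, the derivation of the variable difference data for one block may be sketched as follows. The sketch assumes a plain uniform quantiser (the H.261 quantiser itself differs in detail), and the function names `quantise`, `inverse_quantise` and `difference_data` are hypothetical. The difference is formed here as the extracted DCT data minus the inverse quantised data; this sign convention is an assumption made so that a decoder can add the enhancement data back, and the block energy is the same for either order of subtraction.

```python
def quantise(coeffs, step):
    """Uniform quantiser (an assumed simplification): coefficient -> integer level."""
    return [int(round(c / step)) for c in coeffs]


def inverse_quantise(levels, step):
    """Inverse quantiser: integer level -> reconstructed coefficient."""
    return [level * step for level in levels]


def difference_data(coeffs, base_step):
    """Difference between the DCT data before quantisation and the
    inverse quantised data, for one block of coefficients."""
    reconstructed = inverse_quantise(quantise(coeffs, base_step), base_step)
    return [c - r for c, r in zip(coeffs, reconstructed)]


# Example block of four DCT coefficients with a base layer step size of 8.
block = [100.0, -37.0, 12.0, 3.0]
diff = difference_data(block, base_step=8)
```

With a uniform quantiser of this kind each difference value is bounded in magnitude by half the base layer step size, so a coarser base layer quantiser leaves more for the enhancement layer to carry.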

The variable difference data is selectively coupled by switch (SW1) to a fixed quantiser and variable length coder (Q&VLC) and hence via a multiplexer (Mux) and a line interface (LI) to a variable bit rate (VBR) data channel. The switch (SW1) is controlled by a threshold detector (TH) which receives inputs from the quantiser controller 4 of the H.261 coder and an energy determinator (ED) which operates on the variable difference data. In addition a control circuit (C) operates on the quantiser (Q) of the H.261 coder.

The encoding process of the preferred embodiment will now be described in more detail. The energy of the variable difference data is determined by the energy detector (ED) for each block (8 by 8 PEL) as the sum of the squares of the DCT coefficients of the variable difference data. The calculated block energy (BE) is compared with a threshold level (TL) by threshold detector (TH) which has its threshold set on the basis of the step size of the quantiser (Q) of the H.261 coder. The threshold level (TL) is set as

    TL = K / (base quantiser step size)

where K is a constant. If the block energy (BE) is greater than the threshold level (TL), then the switch (SW1) is operated so that the variable difference data for that block is processed and transmitted as follows.
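The block decision just described may be sketched as follows, purely as an illustration. The threshold is taken to be inversely proportional to the base layer quantiser step size, as stated in the summary of the invention and in the claims. The value of K used in the example and the function names are hypothetical and are not taken from the embodiment.

```python
def block_energy(diff_block):
    """Block energy (BE): sum of the squares of the difference coefficients."""
    return sum(d * d for d in diff_block)


def threshold_level(base_step, k):
    """Threshold level (TL), inversely proportional to the base step size."""
    return k / base_step


def should_code_block(diff_block, base_step, k):
    """True when switch SW1 would pass the block to the enhancement coder."""
    return block_energy(diff_block) > threshold_level(base_step, k)


# Hypothetical constant K and a small example block of difference data.
K = 1000.0
diff = [4.0, 3.0, -4.0, 3.0]                    # block energy is 50
coarse = should_code_block(diff, 30, K)          # TL is about 33.3
fine = should_code_block(diff, 8, K)             # TL is 125
```

In this example the block is coded in the enhancement layer when the base layer quantiser is coarse, and is skipped when the base layer quantiser is fine and the threshold is correspondingly higher.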

Variable difference data received by quantiser and variable length coder (Q and VLC) is quantised at a fixed small step level and coded using a variable length coding, e.g. 2D-VLC coding. The quantised and coded variable difference data is passed via multiplexer (Mux), which adds addressing information, and line interface (LI) to a VBR channel, for example of an ATM network.
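A minimal sketch of the fixed step quantisation followed by the first stage of a two-dimensional (run, level) variable length coding is given below. It assumes the coefficients are already in scan order, uses a hypothetical fixed step size `ENH_STEP`, quantises by truncation towards zero, and stops at the (run, level) events; the code table that maps each event to a variable length code word is not reproduced.

```python
ENH_STEP = 2  # hypothetical fixed enhancement layer step size


def run_level_events(coeffs, step=ENH_STEP):
    """Quantise with a fixed step and return (run, level) events, where
    run is the number of zero levels preceding each non-zero level.
    Trailing zeros produce no event (an end-of-block code is assumed)."""
    events = []
    run = 0
    for c in coeffs:
        level = int(c / step)  # truncation towards zero
        if level == 0:
            run += 1
        else:
            events.append((run, level))
            run = 0
    return events


events = run_level_events([4, 0, 0, -3, 1, 6, 0, 0])
```

Each event would then be mapped to a code word, with shorter words given to the more frequent combinations of run and level.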

The process just described results in data for blocks of variable difference data having an energy greater than the threshold level being coded and transmitted over the VBR channel. Thus, blocks with significant changes in them are transmitted whereas blocks with smaller changes are not. FIG. 4 is a graph comparing the SNR obtained when only blocks exceeding the variable threshold are coded in the second layer with the SNR obtained when no decision is made on blocks in the second layer. The mean bit rate for the second layer has dropped to 31315 bits/s, a saving of 33% on the 2-layer model without any thresholding. The mean SNR has dropped to 39.93 dB (a drop of 0.34 dB). The spread is 1.4 dB.

Data for blocks with instantaneous energy levels below the threshold level will not be transmitted and thus small changes, for example in background detail, may not ever be transmitted as enhancement data. Such small changes may occur at a low rate and gradually an error may build between the "true" image and that encoded and transmitted.

To overcome this problem the step size of the quantiser (Q) of the H.261 encoder is fixed to the same step size as the quantiser in the enhancement layer encoder for part of each frame of an input video image. Thus, an image is notionally divided into 12 groups of blocks (GOBs) and over a sequence of frames the quantiser of the H.261 encoder is set to the fixed step size of the enhancement coder quantiser for each of the GOBs in turn. This results in the coding in the base layer of more data than usual for the GOB selected in a particular frame, and no data will be encoded in the enhancement layer because the threshold level of the enhancement coder will become very high, while at the same time the energy of the variable difference data of the selected GOB will be low. With the quantiser in the H.261 coder having a small step size the quantisation errors will be small and the errors will be less than the quantiser step size in the enhancement layer. The result will be a fall in the instantaneous coded data rate in the enhancement layer and an increase in the instantaneous coded data rate in the base layer, though of course because of the buffering in the base layer the constant bit rate of that layer is maintained. By means of this process of selectively forcing small step size quantisation of GOBs of a picture in turn, any changes in the picture with energy levels too low to be picked up in the enhancement layer coding will periodically be mopped up in the base layer coding.
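The cycling of the forced small step size through the GOBs may be sketched as follows. The sketch assumes that exactly one GOB is forced in each frame and that the GOBs are taken in simple rotation; the function names and the step size values in the example are hypothetical.

```python
NUM_GOBS = 12  # groups of blocks per image, as in the embodiment


def forced_gob(frame_index):
    """Index of the GOB coded with the small fixed step in this frame
    (simple rotation, one GOB per frame; an assumed schedule)."""
    return frame_index % NUM_GOBS


def base_step_for_gob(frame_index, gob_index, controller_step, enh_step):
    """Base layer step size for a GOB: the enhancement layer step size for
    the forced GOB, otherwise the step chosen by the quantiser controller."""
    if gob_index == forced_gob(frame_index):
        return enh_step
    return controller_step


# Frame 13 forces GOB 1; every other GOB keeps the controller's step size.
steps = [base_step_for_gob(13, g, controller_step=16, enh_step=2)
         for g in range(NUM_GOBS)]
```

Under this schedule every GOB is refreshed with the small step size once in every 12 frames.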

The encoding process has been described in relation to the encoder of FIG. 2. The decoding process is essentially the reverse of the encoding process and therefore will only be described in general terms with reference to FIG. 3 which illustrates a decoder of the preferred embodiment. Thus, a received signal from the CBR channel is decoded by an H.261 decoder circuit in a conventional way while the enhancement data on the VBR channel is processed by line interface (LI) and analysed by cell loss detector (CLD) to determine if any data cells have been lost in transmission. The received data is demultiplexed and variable length decoded (DMux & VLD) before being inverse quantised (with the fixed step size) and inverse DCT transformed. The resulting decoded variable data is summed with the output of the H.261 decoder to provide a digital video output signal. To take account of cell loss the demultiplexer of the enhancement layer is synchronised with the demultiplexer of the base, H.261 layer.
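The combination of the two layers at the decoder may be sketched as follows. Since the inverse DCT is linear, the sketch forms the sum on coefficient values and leaves the transform out. It further assumes that the enhancement data was formed as original minus base layer reconstruction, and that a block whose enhancement cells were lost is simply output at base layer quality. These assumptions and the function name are illustrative only.

```python
def reconstruct_block(base_values, enh_levels, enh_step, cell_lost=False):
    """Sum the base layer values with the inverse quantised enhancement data.
    Falls back to the base layer alone when no enhancement data is
    available for the block (not sent, or lost in transmission)."""
    if cell_lost or enh_levels is None:
        return list(base_values)
    return [b + level * enh_step for b, level in zip(base_values, enh_levels)]


base = [96, -40, 16, 0]
enhanced = reconstruct_block(base, [2, 1, -2, 1], enh_step=2)
degraded = reconstruct_block(base, [2, 1, -2, 1], enh_step=2, cell_lost=True)
```

The fallback path corresponds to the minimum picture quality maintained by the base layer data on the CBR channel.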

The preferred embodiment has been described in relation to post DCT differencing to establish the variable difference data for transmission. The differencing may instead occur prior to such coding, in the PEL domain, so that the variable difference data would reflect both the quantisation and transformation errors.

As an alternative to the method of determining block energy described in the preferred embodiment, the block energy may be determined as the sum of the absolute differences, i.e. the sum of the absolute values of the coefficients of the enhancement layer data.
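The two measures of block energy may be compared with the following sketch, in which the function names are hypothetical. Because the two measures are on different scales, the constant K would presumably need a different value for each; that observation is an assumption and is not stated in the embodiment.

```python
def block_energy_squares(diff_block):
    """Block energy as the sum of the squares of the coefficients."""
    return sum(d * d for d in diff_block)


def block_energy_abs(diff_block):
    """Alternative block energy as the sum of the absolute values."""
    return sum(abs(d) for d in diff_block)


diff = [4, 3, -4, 3]
by_squares = block_energy_squares(diff)
by_abs = block_energy_abs(diff)
```

The absolute value form avoids the multiplications needed by the sum of squares.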

We claim:
 1. A method of coding a video signal for transmission, comprising the steps of: coding data representing the video signal by a base layer coding operation which includes a base layer quantization step size to provide coded video data for transmission; deriving inverse-coded video data by carrying out an inverse base layer coding operation on the coded video data; deriving difference data from the data representing the video signal and the inverse-coded video data; and coding the difference data by an enhancement layer coding operation; characterised in that the difference data is coded only when the energy of the difference data exceeds a variable threshold, the threshold being inversely proportional to the base layer quantization step size.
 2. A method as claimed in claim 1 in which the enhancement layer coding operation includes quantization.
 3. A method as claimed in claim 2 in which the base layer quantization step size is selectively set to be the same as an enhancement layer quantization step size for data representing a part of an image of the video signal.
 4. A method as claimed in claim 3 in which the image is divided into a series of sections and for each image of the video signal data, one of the sections of the image is processed with the quantization step size of the base layer coding operation set to be the same as the enhancement layer quantization step size.
 5. A method as claimed in claim 1 in which the data representing the video signal is itself a coded representation of the video signal.
 6. A method as claimed in claim 5 in which the data representing the video signal is a discrete cosine-transform coding of a representation of the video signal.
 7. An apparatus for encoding a video signal for transmission, comprising: means for coding data representing the video signal by a base layer coding operation which includes a base layer quantizer having a base layer quantization step size to provide coded video data for transmission; means for deriving inverse-coded video data by carrying out an inverse base layer coding operation on the coded video data; means for deriving difference data from the data representing the video signal and the inverse-coded video data; and means for coding the difference data by an enhancement layer coding operation; characterised in that the means for coding the difference data operates only when the energy of the difference data exceeds a variable threshold, the threshold being inversely proportional to the base layer quantization step size.